On Channel Estimation Using Superimposed Training and First-Order Statistics
Abstract: Channel estimation for single-input multiple-output
(SIMO) time-invariant channels is considered using only the
first-order statistics of the data. A periodic (nonrandom) training
sequence is added (superimposed) at low power to the information
sequence at the transmitter before modulation and transmission.
Recently, superimposed training has been used for channel estimation
assuming no mean-value uncertainty at the receiver and using
periodically inserted pilot symbols. We propose a different method
that allows more general training sequences and explicitly exploits
the underlying cyclostationary nature of the periodic training
sequences. We also allow mean-value uncertainty at the receiver.
Illustrative computer simulation examples are presented.

Index  Terms — Channel  estimation,  superimposed  training. 


I.  Introduction 

Consider a single-input multiple-output (SIMO)
finite-impulse response (FIR) linear channel with $N$
outputs. Let $\{s(n)\}$ denote a scalar sequence which is input
to the SIMO channel with discrete-time impulse response
$\{\mathbf{h}(l)\}$. The vector channel may be the result of multiple
receive antennas and/or oversampling at the receiver. Then the
symbol-rate channel output vector is given by

$$\mathbf{x}(n) := \sum_{l=0}^{L} \mathbf{h}(l)\, s(n-l). \tag{1}$$

The  noisy  measurements  of  x(n)  are  given  by 

y(n)  =  x(n)  +  v(n).  (2) 
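
As an illustration of (1)-(2), the following NumPy sketch forms the noise-free SIMO output by convolving the scalar input with each of the N sub-channels. The array layout (row l of `h` holds the vector h(l)), the function name `simo_channel_output`, and the convention that s(n) is zero before the first sample are assumptions made for this example only.

```python
import numpy as np

def simo_channel_output(h, s):
    """Noise-free SIMO output x(n) = sum_{l=0}^{L} h(l) s(n-l), as in (1).

    h : array of shape (L+1, N); row l holds the N-vector h(l)
    s : array of shape (T,); scalar input, taken as zero before n = 0
    Returns x of shape (T, N); row n holds x(n).
    """
    h = np.asarray(h, dtype=complex)
    s = np.asarray(s, dtype=complex)
    T = s.shape[0]
    # Convolve the input with each sub-channel and keep the first T samples
    # so that row n of the result lines up with s(n).
    cols = [np.convolve(s, h[:, i])[:T] for i in range(h.shape[1])]
    return np.stack(cols, axis=1)

# Tiny example: L = 1, N = 2, unit-impulse input.
h = np.array([[1.0, 2.0], [0.5, -1.0]])
s = np.array([1.0, 0.0, 0.0])
x = simo_channel_output(h, s)   # rows: x(0), x(1), x(2)
v = np.zeros_like(x)            # noise v(n); zero here to keep the example deterministic
y = x + v                       # measurement model (2)
```

With a unit impulse as input, the rows of `x` simply reproduce h(0), h(1), followed by zeros.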

A main objective in communications is to recover $s(n)$ given
noisy $\{\mathbf{x}(n)\}$. In several approaches this requires knowledge
of the channel impulse response [3], [5]. In the training-based
approach, $s(n) = c(n)$ is a training sequence (known to the
receiver) for (say) $n = 1, 2, \ldots, M$, and $s(n)$ for $n > M$ is the
information sequence (unknown a priori to the receiver) [3], [5].
Therefore, given $c(n)$ and the corresponding noisy $\mathbf{x}(n)$, one
estimates the channel via least-squares and related approaches.
For time-varying channels, one has to send the training signal
frequently and periodically to keep up with the changing channel.
This wastes resources. An alternative is to estimate the channel
based solely on noisy $\mathbf{x}(n)$, exploiting statistical and other
properties of $\{s(n)\}$ [3], [5]. This is the blind channel estimation
approach. More recently, [1] and [2] have explored a superimposed
training based approach for time-invariant systems where
one takes $s(n) = c(n) + b(n)$, $\{b(n)\}$ is the information
sequence, and $\{c(n)\}$ is a nonrandom periodic training (pilot)
sequence. Exploitation of the periodicity of $\{c(n)\}$ allows
identification of the channel without allocating any explicit time slots
for training, unlike traditional training methods. There is no loss
in information rate. On the other hand, some useful power is
wasted in superimposed training which could have otherwise
been allocated to the information sequence. This lowers the
effective signal-to-noise ratio (SNR) for the information sequence
and affects the bit error rate (BER) at the receiver.

Let 

$$s(n) = b(n) + c(n) \tag{3}$$

in (1), where $\{b(n)\}$ is the information sequence and $\{c(n)\}$ is the
superimposed training sequence. Let $\delta(\tau)$ denote the Kronecker
delta, $I_N$ denote the $N \times N$ identity matrix, and the superscript
$H$ denote the complex conjugate transpose operation. Assume
the following:

(H1) the information sequence $\{b(n)\}$ is zero-mean, white
with $E\{|b(n)|^2\} = 1$;

(H2) the measurement noise $\{\mathbf{v}(n)\}$ is nonzero-mean
($E\{\mathbf{v}(n)\} = \mathbf{m}$), white, uncorrelated with $\{b(n)\}$,
with $E\{[\mathbf{v}(n+\tau) - \mathbf{m}][\mathbf{v}(n) - \mathbf{m}]^H\} = \sigma_v^2 I_N \delta(\tau)$.
The mean vector $\mathbf{m}$ is unknown;

(H3) the superimposed training sequence $c(n) = c(n + mP)$
$\forall m, n$ is a nonrandom periodic sequence with
period $P$.

Reference [1] uses the second-order statistics of the received
signal to estimate the channel whereas [2] exploits the first-order
statistics. As in [2], we will exploit the first-order statistics of the
received signal. (A consequence of using the first-order
statistics is that the knowledge of the noise variance $\sigma_v^2$ in (H2) is
not used.) The corresponding time-invariant model in [2] (also
[1]) does not include an unknown constant term (d.c. offset) in
the measurement equation [$\mathbf{m}$ in (H2)]; it should, however, if we
exploit $E\{\mathbf{y}(n)\}$ to estimate the channel. In practice, linear
systems arise because of linearization about some operating (set)
point, e.g., "bias" in amplifiers. These set points are typically
unknown (at least not known precisely) a priori, and one does
not normally worry about them since unknown means are
estimated and removed before processing (blocked by capacitor
coupling, etc.) and they are not needed in any processing.
However, if the (time-varying) mean $E\{\mathbf{y}(n)\}$ is what we wish to use (as
in [2]), then we must include a term such as a nonzero $\mathbf{m}$.
Reference [2] proposes the choice $c(n) = a \sum_k \delta(n - kP)$. The
choice of [2] leads to a poor peak-to-average power ratio of the
transmitted signal, which is highly undesirable if the transmit
power amplifier has some nonlinearity. In this paper we follow
the basic ideas of [1] and [2] but propose a different method
which works for nonzero $\mathbf{m}$ in (H2).
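
The peak-to-average concern can be made concrete with a short Python sketch. It looks only at the training component, not at the full transmitted signal b(n) + c(n), and the period P = 15 and the alternating-sign comparison sequence are arbitrary choices for illustration.

```python
def papr(one_period):
    """Peak-to-average power ratio over one period of a training sequence."""
    powers = [abs(x) ** 2 for x in one_period]
    return max(powers) / (sum(powers) / len(powers))

P = 15                                     # illustrative period (an assumption here)
impulse_train = [1.0] + [0.0] * (P - 1)    # one period of a*sum_k delta(n - kP), a = 1
constant_modulus = [1, -1] * 7 + [1]       # any +/-1 sequence of the same length

papr_impulse = papr(impulse_train)
papr_constant = papr(constant_modulus)
```

An impulse train concentrates all of its training power in one sample per period, so its ratio equals P (about 11.8 dB for P = 15), whereas any constant-modulus sequence attains the minimum value of one.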


II.  Superimposed  Training-Based  Solution 
By (1)-(2) and (H3), we have

$$E\{\mathbf{y}(n)\} = E\{\mathbf{x}(n)\} + \mathbf{m} = \sum_{l=0}^{L} \mathbf{h}(l)\, c(n-l) + \mathbf{m}. \tag{4}$$

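Since $\{c(n)\}$ is periodic, we have (with $\alpha_m := 2\pi m/P$)

$$c(n) = \sum_{m=0}^{P-1} c_m e^{j\alpha_m n} \;\; \forall n, \qquad c_m := \frac{1}{P} \sum_{n=0}^{P-1} c(n)\, e^{-j\alpha_m n}. \tag{5}$$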
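
The expansion of a periodic sequence into complex exponentials can be checked numerically. The sketch below assumes the usual discrete Fourier series convention with $\alpha_m = 2\pi m/P$ and a 1/P factor in the analysis sum; the function names `dfs_coefficients` and `synthesize` and the example sequence are hypothetical.

```python
import cmath

def dfs_coefficients(c_period):
    """c_m = (1/P) * sum_{n=0}^{P-1} c(n) exp(-j*alpha_m*n), alpha_m = 2*pi*m/P."""
    P = len(c_period)
    return [
        sum(c_period[n] * cmath.exp(-2j * cmath.pi * m * n / P) for n in range(P)) / P
        for m in range(P)
    ]

def synthesize(c_m, n):
    """c(n) = sum_{m=0}^{P-1} c_m exp(j*alpha_m*n), valid for every integer n."""
    P = len(c_m)
    return sum(c_m[m] * cmath.exp(2j * cmath.pi * m * n / P) for m in range(P))

c_period = [2.0, 0.0, -1.0, 1.0]      # one period of an arbitrary example sequence
c_m = dfs_coefficients(c_period)
# The synthesis formula reproduces the periodic extension, including negative n.
rebuilt = [synthesize(c_m, n) for n in range(-4, 8)]
```

Because the receiver knows one period of the training, it can compute every coefficient once, offline.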

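The coefficients $c_m$'s are known at the receiver since $\{c(n)\}$ is
known. We have

$$E\{\mathbf{y}(n)\} = \sum_{m=0}^{P-1} \underbrace{\left[ \sum_{l=0}^{L} c_m \mathbf{h}(l)\, e^{-j\alpha_m l} \right]}_{=:\, \mathbf{d}_m} e^{j\alpha_m n} + \mathbf{m}. \tag{6}$$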


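The sequence $E\{\mathbf{y}(n)\}$ is periodic [4] with cycle frequencies
$\alpha_m$, $0 \le m \le P-1$. A mean-square (m.s.) consistent estimate
$\hat{\mathbf{d}}_m$ of $\mathbf{d}_m$, for $\alpha_m \ne 0$, follows as [5]

$$\hat{\mathbf{d}}_m = \frac{1}{T} \sum_{n=1}^{T} \mathbf{y}(n)\, e^{-j\alpha_m n}. \tag{7}$$

As $T \to \infty$, $\hat{\mathbf{d}}_m \to \mathbf{d}_m$ m.s. if $\alpha_m \ne 0$, and $\hat{\mathbf{d}}_0 \to \mathbf{d}_0 + \mathbf{m}$ m.s.
if $\alpha_m = 0$.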
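
A minimal NumPy sketch of the estimate in (7) follows. It assumes row n-1 of the array `y` holds y(n) for n = 1, ..., T; the function name and the test signal are hypothetical. The check is deterministic: the data consist of a single cycle-frequency component plus a constant offset, and T is a multiple of P so that the complex exponentials are exactly orthogonal.

```python
import numpy as np

def cyclic_mean_estimates(y, P):
    """d_hat_m = (1/T) sum_{n=1}^{T} y(n) exp(-j*alpha_m*n) for m = 0..P-1.

    y : array of shape (T, N); row n-1 holds y(n).
    Returns an array of shape (P, N); row m holds d_hat_m.
    """
    y = np.asarray(y, dtype=complex)
    T = y.shape[0]
    n = np.arange(1, T + 1)
    alpha = 2 * np.pi * np.arange(P) / P
    return np.exp(-1j * np.outer(alpha, n)) @ y / T

# Deterministic check: mean with one cycle frequency (m = 1) plus a dc offset.
P, T = 5, 20
n = np.arange(1, T + 1)
d1 = np.array([1 + 2j, -0.5j])
offset = np.array([0.3, -0.7])
y = np.exp(1j * 2 * np.pi * n / P)[:, None] * d1 + offset
d_hat = cyclic_mean_estimates(y, P)
```

Row 1 of `d_hat` returns `d1`, while row 0 returns only the offset, which is why the zero cycle frequency cannot separate the training contribution from an unknown mean.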

We now establish that, given $\mathbf{d}_m$ for $1 \le m \le P-1$, we
can (uniquely) estimate the $\mathbf{h}(l)$'s if $P \ge L+2$, $\alpha_m \ne 0$, and
$c_m \ne 0$ $\forall m \ne 0$. Since $\mathbf{m}$ is unknown, we will omit the term
$m = 0$ for further discussion. Define


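$$V := \begin{bmatrix} 1 & e^{-j\alpha_1} & \cdots & e^{-j\alpha_1 L} \\ 1 & e^{-j\alpha_2} & \cdots & e^{-j\alpha_2 L} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & e^{-j\alpha_{P-1}} & \cdots & e^{-j\alpha_{P-1} L} \end{bmatrix}_{(P-1)\times(L+1)} \tag{8}$$

$$\mathcal{H} := [\mathbf{h}^H(0) \;\; \mathbf{h}^H(1) \;\; \cdots \;\; \mathbf{h}^H(L)]^H \tag{9}$$

$$\mathcal{D} := [\mathbf{d}_1^H \;\; \mathbf{d}_2^H \;\; \cdots \;\; \mathbf{d}_{P-1}^H]^H \tag{10}$$

$$\mathcal{C} := \underbrace{(\mathrm{diag}\{c_1, c_2, \ldots, c_{P-1}\}\, V)}_{=:\, \tilde{V}} \otimes I_N \tag{11}$$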


where $\otimes$ denotes the Kronecker product [7, p. 429]. Omitting
the term $m = 0$ and using the definition of $\mathbf{d}_m$ from (6), it
follows that

$$\mathcal{C}\mathcal{H} = \mathcal{D}. \tag{12}$$

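In (8), $V$ is a Vandermonde matrix with a rank of $L+1$ if $P-1 \ge L+1$
and the $\alpha_i$'s are distinct [6, p. 274]. Since $c_m \ne 0$ $\forall m \ne 0$, by [6,
Result R4, p. 257], $\mathrm{rank}(\tilde{V}) = \mathrm{rank}(V) = L+1$. Finally, by
[7, Property K6, p. 431], $\mathrm{rank}(\mathcal{C}) = \mathrm{rank}(\tilde{V}) \times \mathrm{rank}(I_N) = N(L+1)$.
Therefore, we can determine the $\mathbf{h}(l)$'s uniquely. Define
$\hat{\mathcal{D}}$ as in (12) with the $\mathbf{d}_m$'s replaced with the $\hat{\mathbf{d}}_m$'s. Then we have the
channel estimate

$$\hat{\mathcal{H}} = (\mathcal{C}^H \mathcal{C})^{-1} \mathcal{C}^H \hat{\mathcal{D}}. \tag{13}$$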
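
The estimator can be sketched in a few lines of NumPy, under stated assumptions: rows of `y` are indexed n = 0, ..., T-1 and are time-aligned with the periodic extension of c(n), the least-squares solution is obtained with `numpy.linalg.lstsq` rather than by forming the inverse explicitly, and the function name `estimate_channel` is hypothetical. The demonstration is noise-free and contains no information sequence, so recovery is exact up to rounding; with b(n) and noise present, the estimate is only consistent as T grows.

```python
import numpy as np

def estimate_channel(y, c_period, L):
    """Least-squares solution of C H = D_hat using the nonzero cycle frequencies.

    y        : array (T, N); row n holds y(n), n = 0..T-1, aligned with c(n)
    c_period : one period c(0), ..., c(P-1) of the superimposed training
    L        : channel order, or an upper bound on it (requires P >= L + 2)
    Returns h_hat of shape (L+1, N); row l estimates h(l).
    """
    y = np.asarray(y, dtype=complex)
    T, N = y.shape
    P = len(c_period)
    if P < L + 2:
        raise ValueError("need P >= L + 2")
    m = np.arange(1, P)                      # m = 0 is dropped: it absorbs the unknown mean
    alpha = 2 * np.pi * m / P
    c_m = np.fft.fft(c_period)[1:] / P       # DFS coefficients c_1, ..., c_{P-1}
    d_hat = np.exp(-1j * np.outer(alpha, np.arange(T))) @ y / T    # (P-1, N)
    V = np.exp(-1j * np.outer(alpha, np.arange(L + 1)))            # (P-1, L+1)
    C = np.kron(c_m[:, None] * V, np.eye(N))                       # (N(P-1), N(L+1))
    H_hat, *_ = np.linalg.lstsq(C, d_hat.reshape(-1), rcond=None)
    return H_hat.reshape(L + 1, N)

# Noise-free demonstration with an unknown dc offset (all values illustrative).
rng = np.random.default_rng(0)
c_period = np.array([-1, -1, -1, 1, 1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1], dtype=float)
P, L, N, T = 15, 3, 2, 45
h = rng.standard_normal((L + 1, N)) + 1j * rng.standard_normal((L + 1, N))
offset = np.array([0.4 - 0.2j, -1.0 + 0.0j])
n = np.arange(T)
y = sum(np.outer(c_period[(n - l) % P], h[l]) for l in range(L + 1)) + offset
h_hat = estimate_channel(y, c_period, L)          # exact order
h_hat_u = estimate_channel(y, c_period, 5)        # overestimated order
```

Calling the function with an overestimated order returns the true taps followed by (numerically) zero taps, and the dc offset has no effect because the zero cycle frequency is never used.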


Precise knowledge of the channel length $L$ is not required; an
upper bound $L_u$ suffices. Then we estimate $\mathbf{h}(l)$ for $0 \le l \le L_u$,
with $\hat{\mathbf{h}}(l) \to 0$ m.s. for $l \ge L+1$ (= true channel length) as
record length $T \to \infty$. Also, we do not need $c_m \ne 0$ for every
$m$. We need at least $L+2$ nonzero $c_m$'s. This can be
accomplished by picking a "large" $P$ and a suitable $\{c(n)\}$ (picked, for
example, to satisfy a peak-to-average power constraint). Implicit in our
approach (also in [1] and [2]) is the need at the receiver for
synchronization with the transmitter's superimposed training
sequence.

A.  Equalization 

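With $\hat{\mathbf{h}}(l)$ denoting the estimated $\mathbf{h}(l)$ and $\tilde{\mathbf{v}}(n) := \mathbf{v}(n) - \mathbf{m}$, define

$$\tilde{\mathbf{y}}(n) := \mathbf{y}(n) - \sum_{l=0}^{L} \hat{\mathbf{h}}(l)\, c(n-l) - \hat{\mathbf{m}} \approx \sum_{l=0}^{L} \mathbf{h}(l)\, b(n-l) + \tilde{\mathbf{v}}(n) \tag{14}$$

where $\hat{\mathbf{m}} := (1/T) \sum_{n=1}^{T} [\mathbf{y}(n) - \sum_{l=0}^{L} \hat{\mathbf{h}}(l)\, c(n-l)]$. That
is, $\tilde{\mathbf{y}}(n)$ is obtained by removing the (estimated) contribution
of the superimposed training and the dc-offset from the noisy
data. Model (14) with the estimated channel is used to equalize
the channel and to detect the information sequence. For the
simulations of Section III we used a linear MMSE (minimum
mean-square error) equalizer, which also requires the knowledge
of the correlation function of $\tilde{\mathbf{y}}(n)$. We estimate the noise
variance $\sigma_v^2$ (see (H2)) as ($\mathrm{tr}\{A\}$ denotes the trace of matrix $A$)

$$\hat{\sigma}_v^2 = \frac{1}{N}\, \mathrm{tr}\left\{ \frac{1}{T} \sum_{n=1}^{T} \tilde{\mathbf{y}}(n)\, \tilde{\mathbf{y}}^H(n) - \sum_{l=0}^{L} \hat{\mathbf{h}}(l)\, \hat{\mathbf{h}}^H(l) \right\}. \tag{15}$$

(If (15) yields a negative result, we set it to zero.) The
correlation function of $\tilde{\mathbf{y}}(n)$ can then be estimated using the estimated
channel (instead of the less reliable sample averaging); only the
zero-lag correlation requires $\hat{\sigma}_v^2$.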
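
The two preprocessing steps before equalization, removing the estimated training contribution and dc offset and then estimating the noise variance, might look as follows in NumPy. This is a sketch under assumptions: rows of `y` are indexed n = 0, ..., T-1 and aligned with the periodic extension of c(n), the information sequence has unit power as in (H1), the variance estimate is the per-output average of the residual power minus the estimated channel energy, and both function names are hypothetical.

```python
import numpy as np

def remove_training_and_offset(y, h_hat, c_period):
    """Subtract the estimated training contribution and the estimated dc offset.

    y : (T, N) data, h_hat : (L+1, N) channel estimate, c_period : one period of c(n).
    Returns (y_tilde, m_hat).
    """
    y = np.asarray(y, dtype=complex)
    h_hat = np.asarray(h_hat, dtype=complex)
    c = np.asarray(c_period, dtype=complex)
    T, P = y.shape[0], len(c)
    n = np.arange(T)
    # sum_l h_hat(l) c(n - l), using the periodic extension of the training
    train = sum(np.outer(c[(n - l) % P], h_hat[l]) for l in range(h_hat.shape[0]))
    resid = y - train
    m_hat = resid.mean(axis=0)
    return resid - m_hat, m_hat

def noise_variance_estimate(y_tilde, h_hat):
    """Per-output residual power minus estimated channel energy, floored at zero."""
    y_tilde = np.asarray(y_tilde, dtype=complex)
    h_hat = np.asarray(h_hat, dtype=complex)
    T, N = y_tilde.shape
    power = np.sum(np.abs(y_tilde) ** 2) / T     # trace of the zero-lag sample correlation
    chan = np.sum(np.abs(h_hat) ** 2)            # trace of sum_l h_hat(l) h_hat(l)^H
    return max(float(power - chan) / N, 0.0)

# Deterministic check: data made only of training plus a constant offset.
c_period = [1.0, -1.0, 1.0]
h_hat = np.array([[1.0 + 0j], [0.5j]])
n = np.arange(6)
y = sum(np.outer(np.array(c_period)[(n - l) % 3], h_hat[l]) for l in range(2)) + 0.7
y_tilde, m_hat = remove_training_and_offset(y, h_hat, c_period)
```

In this check the residual is zero and the offset estimate equals 0.7; the variance estimate then hits the floor at zero, which is the case covered by the parenthetical remark on negative results.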

III.  Simulation  Examples 

A.  Example  1 

Consider a continuous-time channel $h(t)$ given by
$h(t) = \sum_{i} a_i\, p_{4T_s}(t - \tau_i; 0.2)$ where $T_s$ is the symbol
interval, $p_{4T_s}(t; 0.2)$ denotes the raised-cosine pulse with roll-off
factor 0.2 and length truncated to $4T_s$ (i.e., $p_{4T_s}(t; 0.2) = 0$
for $|t| > 2T_s$), the amplitudes $a_i$'s are mutually independent,
zero-mean, complex Gaussian with the same variance for all $i$'s,
and the delays $\tau_i$'s are mutually independent, uniformly distributed
over $[0, 4T_s]$. The continuous-time channel $h(t)$ is sampled
once every $T_s$ seconds to yield the discrete-time channel
$h(n) := h((n-1)T_s)$. Thus we have $N = 1$ in (1), leading to

$$y(n) = \sum_{l=0}^{7} h(l)\,[b(n-l) + c(n-l)] + v(n). \tag{16}$$
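
A channel of this kind can be drawn with the NumPy sketch below, which should be read as one plausible setup rather than the exact simulation code. Time is measured in units of the symbol interval. The number of paths (`num_paths = 3`), the unit amplitude variance, and the sampling grid t = -1, 0, ..., 6 (eight taps) are assumptions made here; the function names are hypothetical.

```python
import numpy as np

def raised_cosine(t, beta=0.2, span=4.0):
    """Raised-cosine pulse (t in symbol intervals), set to zero for |t| > span/2.

    For beta = 0.2 the removable singularity at |t| = 2.5 lies outside the
    truncation window, so it is not treated specially here.
    """
    t = np.asarray(t, dtype=float)
    p = np.sinc(t) * np.cos(np.pi * beta * t) / (1.0 - (2.0 * beta * t) ** 2)
    return np.where(np.abs(t) <= span / 2, p, 0.0)

def sampled_channel(amps, delays, n_taps=8):
    """Symbol-rate samples of sum_i a_i p(t - tau_i) at t = -1, 0, ..., n_taps - 2."""
    t = np.arange(n_taps) - 1.0
    amps = np.asarray(amps, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    return (amps[None, :] * raised_cosine(t[:, None] - delays[None, :])).sum(axis=1)

rng = np.random.default_rng(1)
num_paths = 3                     # hypothetical: the number of paths is an assumption
amps = (rng.standard_normal(num_paths) + 1j * rng.standard_normal(num_paths)) / np.sqrt(2)
delays = rng.uniform(0.0, 4.0, num_paths)
h = sampled_channel(amps, delays)

# Sanity case: a single path with zero delay gives one nonzero tap at t = 0.
h_single = sampled_channel([2 + 1j], [0.0])
```

Drawing fresh amplitudes and delays for each Monte Carlo run gives a different channel realization per run.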

Let $L_u$ be the upper bound on the channel length $L = 7$. We
take $L_u = 10$. The channel is randomly generated in each
Monte Carlo run. The input information sequence $\{b(n)\}$ is i.i.d.
equiprobable 4-QAM (quadrature amplitude modulation)
taking values $(\pm 1 \pm j)/\sqrt{2}$. The training sequence was chosen
to have $P = 15$ with $c(n) = a \sum_k \delta(n - 15k)$ as in [2]; $a$
is picked to yield a particular training-to-information sequence
power ratio (TIR) $\alpha = \sigma_c^2/\sigma_b^2$, where $\sigma_b^2$ and $\sigma_c^2$ denote the
average power in the information sequence $\{b(n)\}$ and training
sequence $\{c(n)\}$, respectively. Complex white zero-mean
Gaussian noise was added to the received signal and scaled
to achieve an SNR at the receiver (relative to the contribution
of $\{s(n)\}$). A mean value $m$ was added to the noisy received
signal to achieve a specified dc-offset to signal ac-component
(DCAC) power ratio $m^2 / E\{|y(n) - v(n)|^2\}$.

TUGNAIT AND LUO: ON CHANNEL ESTIMATION USING SUPERIMPOSED TRAINING AND FIRST-ORDER STATISTICS

[Figs. 1 and 2: horizontal axis SNR (dB), 0 to 14 dB.]

Fig. 1. (a) Example 1: Normalized channel MSE (17) based on
$T = 150$ symbols per run, 100 Monte Carlo runs, $P = 15$, TIR $\alpha = 0.585$.
DCAC ratio $= [E\{v(n)\}]^2 / E\{|y(n) - v(n)|^2\}$. The curves for
the proposed method for different DCAC ratios are overlaid (very close).
(b) Normalized channel MSE for Example 2; the rest as for Fig. 1(a).

Fig. 2. Example 2. (a) Equalization performance using linear MMSE
equalizers based on $T = 150$ or 300 symbols per run, 100 Monte Carlo runs,
$P = 15$. DCAC ratio $= 0$, TIR $\alpha = 0.585$. (b) Fig. 2(a) redrawn with
the curve for the known-channel linear MMSE equalizer adjusted by 2 dB -
no power is wasted in training.

The normalized mean-square error in estimating the channel impulse
response, averaged over 100 Monte Carlo runs, was taken as the
performance measure for channel estimation. It is defined as (before
Monte Carlo averaging)

$$\mathrm{NCMSE} := \frac{\sum_{l=0}^{L_u} \|\hat{h}(l) - h(l)\|^2}{\sum_{l=0}^{L} \|h(l)\|^2}. \tag{17}$$
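
One common way to compute a normalized channel error of this kind is sketched below. It assumes the estimate may contain more taps than the true channel (because an upper bound on the length was used), in which case the true channel is zero-padded before the comparison; the function name `ncmse` is hypothetical.

```python
import numpy as np

def ncmse(h_hat, h_true):
    """Squared error over all estimated taps divided by the true channel energy.

    h_hat  : estimated taps, first axis of length >= the number of true taps
    h_true : true taps; taps beyond its length are treated as zero
    """
    h_hat = np.asarray(h_hat, dtype=complex)
    h_true = np.asarray(h_true, dtype=complex)
    padded = np.zeros_like(h_hat)
    padded[: h_true.shape[0]] = h_true
    return float(np.sum(np.abs(h_hat - padded) ** 2) / np.sum(np.abs(h_true) ** 2))

# Two true taps, three estimated taps; only the spurious third tap is in error.
value = ncmse([1.0, 1.0j, 0.5], [1.0, 1.0j])
```

Averaging this quantity over independent runs gives the Monte Carlo figure of merit.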

The simulation results are shown in Fig. 1(a) for various
SNRs and DCAC power ratios for a record length of $T = 150$
symbols and a TIR of $-2.33$ dB ($\alpha = 0.585$). Our proposed
method and that of [2] were simulated. It is seen that the
proposed method is insensitive to the presence of the unknown
mean $m$ whereas the method of [2] is very sensitive. For
$m = 0$, the performance of our method is slightly inferior to
that of [2]. In the method of [2], the $h(l)$'s are estimated directly
from data for $1 \le l \le L_u + 1 = 11$ whereas in our approach,
we first estimate the $d_m$'s for $1 \le m \le P - 1 = 14$ and then use
(13). Since we estimate more variables (14 versus 11), this may
account for the slightly inferior performance of our method for
$m = 0$.

B.  Example  2 

This example is exactly as Example 1 except for
the training sequence, which was taken to be an m-sequence
(maximal length pseudorandom binary sequence)
of length 15 ($= P$), $c(n) = \sqrt{\alpha}\, \bar{c}(n)$, $\{\bar{c}(n)\}_{n=0}^{14} =
\{-1, -1, -1, 1, 1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1\}$. The
peak-to-average power ratio for this sequence is one (the best
possible). The simulation results are shown in Fig. 1(b) for a
record length of $T = 150$ symbols and a TIR of $-2.33$ dB
($\alpha = 0.585$). Only our proposed method was simulated since
the method of [2] does not apply to this model. It is seen that,
as in Example 1, the proposed method is insensitive to the
presence of the unknown mean $m$. Equalization performance
(BER) of a linear MMSE equalizer based on the estimated
channel (Example 2) is shown in Fig. 2(a) for two different
record lengths of $T = 150$ and 300 symbols. The linear
equalizer was designed as noted in Section II with an equalizer
length of 10 symbols and a delay of five symbols. Also shown
is the performance of a linear equalizer based upon perfect
knowledge of the channel and noise variance. It is seen that
the performance improves with record length. Note that for our
choice of $\alpha = 0.585$, the SNR relative to $\{b(n)\}$ would be
2 dB less than the SNR shown in Fig. 2(a), which is relative
to $\{s(n)\}$. To reflect this loss in SNR due to inclusion of the
superimposed training, we redraw Fig. 2(a) as Fig. 2(b) with
the SNR for the curve for the known-channel linear MMSE
equalizer adjusted by 2 dB.
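
Two numerical facts used in this example can be checked with a short standard-library Python sketch: the length-15 sequence has nonzero Fourier series coefficients at every nonzero cycle frequency, and the quoted decibel figures follow from the training-to-information power ratio. The sketch assumes the ratio is defined as training power over information power with unit-power information symbols; variable and function names are hypothetical.

```python
import cmath
import math

# One period of the +/-1 training sequence of Example 2 (before scaling).
C_BAR = [-1, -1, -1, 1, 1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1]

def dfs_magnitudes(seq):
    """|c_m| for m = 0..P-1, with c_m = (1/P) sum_n c(n) exp(-j*2*pi*m*n/P)."""
    P = len(seq)
    return [
        abs(sum(seq[n] * cmath.exp(-2j * cmath.pi * m * n / P) for n in range(P))) / P
        for m in range(P)
    ]

mags = dfs_magnitudes(C_BAR)

alpha = 0.585                                   # training-to-information power ratio
tir_db = 10 * math.log10(alpha)                 # ratio expressed in dB
snr_loss_db = 10 * math.log10(1 + alpha)        # SNR w.r.t. s(n) minus SNR w.r.t. b(n)
```

For this sequence all fourteen nonzero-frequency magnitudes equal 4/15 and the zero-frequency magnitude is 1/15, so every cycle frequency used by the estimator carries the same training energy. The ratio evaluates to about -2.33 dB and the SNR penalty to about 2.0 dB.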